Pepsi

Importing Libraries

Data Exploration

Data Cleaning

Clustering using HashingVectorizer in sklearn

PCA for visualization

Word Clouds

Using TextBlob for Sentiment Analysis

Visualize Polarity and Subjectivity of Tweets

~~~~~~~~~~~~~~~~~

TF-IDF Vectorizer and Finding the Best K with Elbow Method

Read the Data

Data Preprocessing

Apply the spaCy tokenizer, TF-IDF, and K-means to the first 100 tweets

Check cosine similarity
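
As a rough illustration of this step, the sketch below computes cosine similarity between TF-IDF vectors of a few short texts. It is a pure-Python stand-in, not the scikit-learn pipeline used elsewhere in this outline. The sample texts, the whitespace tokenizer, the smoothed IDF formula, and the helper names `tfidf_vectors` and `cosine_similarity` are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Assumption: simple lowercase whitespace tokenization, not spaCy.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF so that terms present in every document keep a weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up sample texts, for illustration only.
tweets = [
    "vaccine trial results announced",
    "vaccine trial results delayed",
    "new soda flavor released today",
]
vecs = tfidf_vectors(tweets)
sim_related = cosine_similarity(vecs[0], vecs[1])    # texts sharing three terms
sim_unrelated = cosine_similarity(vecs[0], vecs[2])  # texts sharing no terms
```

Texts that share vocabulary score closer to 1, while texts with no terms in common score 0, which is why this check is a useful sanity test before clustering.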

Create the clustering table

Elbow Method to determine the best K
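
The idea behind the elbow method can be sketched as follows: run K-means for a range of K values, record the inertia (sum of squared distances to the nearest center) for each, and look for the K after which the curve flattens. This is a minimal pure-Python sketch on synthetic 2-D points, not the TF-IDF features or the scikit-learn `KMeans` used for the tweets. The data, the helper names, and the restart count are assumptions chosen for the example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_inertia(points, k, n_iter=50, seed=0):
    """Run Lloyd's algorithm once and return the final inertia."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from the data
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if empty.
        centers = [
            tuple(sum(xs) / len(c) for xs in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def best_inertia(points, k, n_init=5):
    """Lowest inertia over several restarts (hypothetical helper)."""
    return min(kmeans_inertia(points, k, seed=s) for s in range(n_init))


# Synthetic data: three blobs, so the elbow is expected near K=3.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [
    (cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
    for cx, cy in blob_centers
    for _ in range(20)
]

inertias = {k: best_inertia(points, k) for k in range(1, 7)}
for k, inertia in inertias.items():
    print(f"K={k}: inertia={inertia:.1f}")
```

Plotting `inertias` against K and picking the point where the drop levels off gives a candidate K. On real tweet vectors the elbow is often less sharp, so the choice usually involves some judgment.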

Choose K=10 as an experiment

Cluster 2 focuses on topics related to the Russian vaccine

Cluster 3 contains more rumors and negative reactions